Skip to content

Feature/add prometheus and grafana#8357

Open
NazarVavrushchak wants to merge 5 commits intodevfrom
feature/Add-prometheus-and-grafana
Open

Feature/add prometheus and grafana#8357
NazarVavrushchak wants to merge 5 commits intodevfrom
feature/Add-prometheus-and-grafana

Conversation

@NazarVavrushchak
Copy link
Contributor

@NazarVavrushchak NazarVavrushchak commented Apr 24, 2025

Link Issue
8077

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive application monitoring and health checks, including metrics for active users, login events, CPU usage, memory, garbage collection, HTTP requests, and response times.
    • Added REST endpoints and actuator integration for exposing metrics and health information.
    • Enabled Prometheus and Grafana integration for real-time monitoring via Docker Compose.
    • Added custom health indicators for database, disk space, memory usage, habit tracking, notification service, and user engagement.
  • Bug Fixes

    • None.
  • Documentation

    • None.
  • Tests

    • Added extensive unit tests for all new metrics and health indicator components to ensure reliability and correctness.
  • Chores

    • Updated configuration files and dependencies to support monitoring and health features.
    • Added Prometheus configuration and Docker Compose setup for monitoring stack.
  • Refactor

    • None.
  • Style

    • None.
  • Revert

    • None.

@coderabbitai
Copy link

coderabbitai bot commented Apr 24, 2025

Walkthrough

This pull request introduces comprehensive monitoring and health-check capabilities to the application. It adds Prometheus and Grafana integration via Docker Compose, configures Prometheus scraping, and enables Spring Boot Actuator endpoints. The core and service modules gain new dependencies for Micrometer and Actuator. Several new metrics filters and health indicators are implemented, tracking CPU, memory, garbage collection, request counts, response times, and user activity. Health indicators for database, disk, memory, notifications, user engagement, and habit tracking are included. Extensive unit tests cover all new metrics and health components. Security configuration is updated to allow actuator endpoint access.

Changes

Files/Paths Change Summary
core/pom.xml, service/pom.xml Added Micrometer Prometheus registry and Spring Boot Actuator dependencies.
core/src/main/resources/application.properties Enabled actuator endpoints, set actuator base path, and configured health details.
docker-compose.yml, prometheus.yml Introduced Docker Compose for Prometheus & Grafana; added Prometheus scraping config.
core/src/main/java/greencity/config/SecurityConfig.java Injected metrics filters, permitted actuator endpoints, updated filter chain for metrics tracking.
core/src/main/java/greencity/controller/MetricsController.java Added new REST controller for recording login events to metrics.
core/src/main/java/greencity/dto/metric/LoginEventDto.java Introduced DTO for login event data.
core/src/main/java/greencity/metrics/ActiveUsersInMemoryMetrics.java New filter for tracking active users and login events in memory, exposes metrics via Micrometer.
core/src/main/java/greencity/metrics/CPUMetrics.java New component for CPU usage metrics, scheduled updates, and gauge registration.
core/src/main/java/greencity/metrics/GarbageCollectionMetrics.java New component for tracking JVM garbage collection time/count over a rolling window.
core/src/main/java/greencity/metrics/MemoryMetrics.java New component for JVM memory usage metrics, with scheduled updates and gauges.
core/src/main/java/greencity/metrics/RequestCountMetrics.java New filter for counting HTTP requests, successes, failures, Prometheus requests, and error rates.
core/src/main/java/greencity/metrics/ResponseTimeMetrics.java New filter for measuring HTTP response times (min, max, avg) via Micrometer.
dao/src/main/java/greencity/repository/NotificationRepo.java Added method to count notifications since a given time.
dao/src/main/java/greencity/repository/UserRepo.java Added method to count users registered after a specific date.
service-api/src/main/java/greencity/constant/ErrorMessage.java Added constant for null request/response error message.
service/src/main/java/greencity/health/DatabasePerformanceHealthIndicator.java New health indicator for database connectivity and latency, with metrics.
service/src/main/java/greencity/health/DiskSpaceHealthIndicator.java New health indicator for disk space, with metrics and threshold logic.
service/src/main/java/greencity/health/HabitTrackingHealthIndicator.java New health indicator for habit tracking activity, with metrics and thresholds.
service/src/main/java/greencity/health/MemoryUsageHealthIndicator.java New health indicator for JVM memory usage, with metrics and thresholds.
service/src/main/java/greencity/health/NotificationServiceHealthIndicator.java New health indicator for notification activity, with metrics and thresholds.
service/src/main/java/greencity/health/UserEngagementHealthIndicator.java New health indicator for user registration activity, with metrics and thresholds.
core/src/test/java/greencity/controller/MetricsControllerTest.java Added unit tests for MetricsController's login event recording.
core/src/test/java/greencity/metrics/ActiveUsersInMemoryMetricsTest.java Added unit tests for active user and login metrics tracking.
core/src/test/java/greencity/metrics/CPUMetricsTest.java Added unit tests for CPU metrics component.
core/src/test/java/greencity/metrics/GarbageCollectionMetricsTest.java Added unit tests for garbage collection metrics component.
core/src/test/java/greencity/metrics/MemoryMetricsTest.java Added unit tests for memory metrics component.
core/src/test/java/greencity/metrics/RequestCountMetricsTest.java Added unit tests for HTTP request count metrics filter.
core/src/test/java/greencity/metrics/ResponseTimeMetricsTest.java Added unit tests for response time metrics filter.
service/src/test/java/greencity/health/DatabasePerformanceHealthIndicatorTest.java Added unit tests for database health indicator.
service/src/test/java/greencity/health/DiskSpaceHealthIndicatorTest.java Added unit tests for disk space health indicator.
service/src/test/java/greencity/health/HabitTrackingHealthIndicatorTest.java Added unit tests for habit tracking health indicator.
service/src/test/java/greencity/health/MemoryUsageHealthIndicatorTest.java Added unit tests for memory usage health indicator.
service/src/test/java/greencity/health/NotificationServiceHealthIndicatorTest.java Added unit tests for notification service health indicator.
service/src/test/java/greencity/health/UserEngagementHealthIndicatorTest.java Added unit tests for user engagement health indicator.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MetricsController
    participant ActiveUsersInMemoryMetrics

    User->>MetricsController: POST /metrics/recordLogin (LoginEventDto)
    MetricsController->>ActiveUsersInMemoryMetrics: recordLogin(email, loginTime)
    ActiveUsersInMemoryMetrics-->>MetricsController: (updates login metrics)
    MetricsController-->>User: HTTP 200 OK
Loading
sequenceDiagram
    participant HTTPClient
    participant SecurityConfig
    participant RequestCountMetrics
    participant AccessTokenAuthenticationFilter
    participant ActiveUsersInMemoryMetrics

    HTTPClient->>SecurityConfig: HTTP Request
    SecurityConfig->>RequestCountMetrics: addFilterBefore
    SecurityConfig->>ActiveUsersInMemoryMetrics: addFilterAfter
    RequestCountMetrics->>AccessTokenAuthenticationFilter: (proceed filter chain)
    AccessTokenAuthenticationFilter->>ActiveUsersInMemoryMetrics: (proceed filter chain)
    ActiveUsersInMemoryMetrics-->>HTTPClient: (response)
Loading

Possibly related PRs

Suggested labels

back-end

Suggested reviewers

  • maks741
  • oleksandra-bulhakova
  • RomaMocherniuk
  • VasylyshynDmytro

Poem

Metrics bloom and health checks sing,
Prometheus and Grafana bring
Insights deep and numbers bright,
Filters track each user's flight.
With tests in tow and code so neat,
Our system's pulse is now complete!
🚦📊🩺

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 25

🧹 Nitpick comments (28)
dao/src/main/java/greencity/repository/NotificationRepo.java (1)

195-195: Add Javadoc for new repository method.

The new countByTimeAfter method follows Spring Data JPA naming conventions and will work correctly. To maintain consistency with the rest of the interface, please add Javadoc documentation for this method.

+/**
+ * Counts the number of notifications created after the specified date.
+ *
+ * @param date the date after which to count notifications
+ * @return the count of notifications created after the specified date
+ */
 long countByTimeAfter(ZonedDateTime date);
core/src/main/resources/application.properties (1)

104-109: Consider limiting exposed actuator endpoints.

The Prometheus configuration looks good, but exposing all management endpoints (*) could potentially expose sensitive information or operations. Consider restricting this to only the endpoints you actually need.

#Prometheus
-management.endpoints.web.exposure.include=*
+management.endpoints.web.exposure.include=health,info,prometheus,metrics
#Optional path
management.endpoints.web.base-path=/actuator
#Also optional
management.endpoint.health.show-details=always
core/src/test/java/greencity/controller/MetricsControllerTest.java (1)

46-83: Great edge case coverage, consider adding exception handling tests.

These tests do an excellent job covering various null and empty input scenarios. However, I recommend adding tests for exception handling scenarios to ensure robustness.

Consider adding a test like this:

@Test
void recordLoginWhenServiceThrowsExceptionTest() {
    // Given
    doThrow(new RuntimeException("Service failure")).when(activeUsersInMemoryMetrics)
        .recordLogin("error@example.com", 1234567890L);
    
    LoginEventDto errorDto = new LoginEventDto("error@example.com", 1234567890L);
    
    // When/Then
    assertThrows(RuntimeException.class, () -> metricsController.recordLogin(errorDto));
    
    verify(activeUsersInMemoryMetrics).recordLogin("error@example.com", 1234567890L);
    verifyNoMoreInteractions(activeUsersInMemoryMetrics);
}
core/src/main/java/greencity/dto/metric/LoginEventDto.java (1)

7-13: Consider adding validation annotations to the fields.

This DTO looks well-structured with appropriate Lombok annotations. To enhance robustness, consider adding validation annotations like @Email for the email field and @NotNull for both fields. Also, adding documentation to clarify the expected format of loginTime (e.g., milliseconds since epoch) would improve maintainability.

+import javax.validation.constraints.Email;
+import javax.validation.constraints.NotNull;

 @Data
 @AllArgsConstructor
 @NoArgsConstructor
 public class LoginEventDto {
+    @NotNull
+    @Email
     private String email;
+    @NotNull
     private Long loginTime;
 }
service/src/main/java/greencity/health/MemoryUsageHealthIndicator.java (1)

18-18: Make memory threshold configurable.

Consider making the memory usage threshold configurable through a property rather than hardcoding it. This allows for environment-specific tuning without code changes.

-    private static final double MAX_MEMORY_USAGE_PERCENTAGE = 80.0;
+    @Value("${health.memory.max-usage-percentage:80.0}")
+    private double maxMemoryUsagePercentage;

Don't forget to add the import:

import org.springframework.beans.factory.annotation.Value;
service/src/main/java/greencity/health/DatabasePerformanceHealthIndicator.java (1)

36-37: Make database type configurable rather than hardcoded

The health indicator hardcodes "PostgreSQL" as the database type, which may not be accurate if a different database is used. Consider making this configurable or determining it dynamically.

+   private String determineDatabaseType(Connection connection) {
+       try {
+           return connection.getMetaData().getDatabaseProductName();
+       } catch (Exception e) {
+           logger.warn("Could not determine database type: {}", e.getMessage());
+           return "Database";
+       }
+   }

@Override
public Health health() {
    try (Connection connection = dataSource.getConnection();
        Statement statement = connection.createStatement()) {
        long startTime = System.currentTimeMillis();
        statement.execute("SELECT 1");
        long latencyMs = System.currentTimeMillis() - startTime;

        meterRegistry.gauge("app_database_health", 1);
        return Health.up()
-           .withDetail("database", "PostgreSQL is reachable")
+           .withDetail("database", determineDatabaseType(connection) + " is reachable")
            .withDetail("latencyMs", latencyMs)
            .build();
core/src/main/java/greencity/metrics/CPUMetrics.java (1)

3-4: Consider adding a note about the use of com.sun.management package

The com.sun.management package is an Oracle-specific API that might not be available in all JVM implementations, which could be a portability concern. Consider adding a comment to document this dependency and potential alternatives.

+/**
+ * Monitors CPU usage metrics using Micrometer.
+ * Note: This class depends on com.sun.management.OperatingSystemMXBean which is
+ * an implementation-specific API that might not be available in all JVMs.
+ */
@Component
public class CPUMetrics {
service/src/test/java/greencity/health/UserEngagementHealthIndicatorTest.java (1)

41-43: Mocking gauge() is redundant and may conceal signature mismatches

MeterRegistry#gauge(String, Number) returns the same Number instance back and its return value isn’t used inside health().
You can safely drop the stubbing lines and keep only the verify step. This trims noise and avoids a brittle eq(1) / eq(0) matcher that might break if the production code switches to 1.0 (Double) or tags.

-when(meterRegistry.gauge(eq("app_user_engagement_health"), eq(1))).thenReturn(null);
+// no stubbing needed – we only verify the invocation later

(and similarly for eq(0))

Also applies to: 66-68, 92-94

service/src/main/java/greencity/health/DiskSpaceHealthIndicator.java (1)

36-45: Consider re-using a single Gauge instead of calling gauge() every check

meterRegistry.gauge("app_disk_space_health", value) creates a new gauge each invocation, which can leak meters over time. Prefer registering once in the constructor (as done in ResponseTimeMetrics) or by updating a AtomicInteger.

core/src/main/java/greencity/metrics/ResponseTimeMetrics.java (1)

66-71: Null-check guard adds no value inside a servlet container

OncePerRequestFilter is only invoked with non-null request, response, and filterChain; the explicit null guard adds overhead without providing extra safety. Consider removing validateParameters() to streamline the hot path.

core/src/test/java/greencity/metrics/MemoryMetricsTest.java (2)

69-80: Consider extracting reflection utility methods.

The reflection code for accessing private fields appears in multiple places throughout the test class. Consider extracting these operations into utility methods to improve readability and reduce duplication.

+ private <T> T getPrivateField(Object object, String fieldName, Class<T> fieldType) throws NoSuchFieldException, IllegalAccessException {
+     Field field = object.getClass().getDeclaredField(fieldName);
+     field.setAccessible(true);
+     return fieldType.cast(field.get(object));
+ }

82-94: Verify gauge updates properly.

The test correctly verifies that the gauges are registered with the meter registry. However, it only checks that the gauges are registered, not that they return the expected values. Consider enhancing this test to verify the actual values returned by the gauges.

service/src/main/java/greencity/health/NotificationServiceHealthIndicator.java (1)

34-36: Consider using UTC for time calculations.

Using ZoneId.systemDefault() can lead to inconsistent behavior when deploying across different environments with different time zones. Consider using UTC for more consistent time-based calculations.

-            LocalDateTime last24HoursLocal = LocalDateTime.now().minusHours(24);
-            ZonedDateTime last24Hours = last24HoursLocal.atZone(ZoneId.systemDefault());
+            ZonedDateTime now = ZonedDateTime.now(ZoneId.of("UTC"));
+            ZonedDateTime last24Hours = now.minusHours(24);
service/src/test/java/greencity/health/DiskSpaceHealthIndicatorTest.java (1)

56-76: Add test for threshold boundary values.

The test cases cover scenarios well above and well below the threshold, but consider adding a test at exactly the threshold boundary to ensure that edge cases are handled correctly.

For a complete test suite, add a test case for:

  • Free space exactly at the 10% threshold (100L free space with 1000L total)
  • Free space just below the threshold (99L free space with 1000L total)
  • Free space just above the threshold (101L free space with 1000L total)
core/src/main/java/greencity/metrics/RequestCountMetrics.java (3)

74-86: Consider adding logging for request ID generation.

When generating new request IDs, consider adding debug logging to help with troubleshooting in production environments.

 private String ensureRequestId(HttpServletRequest request, HttpServletResponse response) {
     String requestId = request.getHeader(REQUEST_ID_HEADER);
     if (requestId == null || requestId.isEmpty()) {
         requestId = UUID.randomUUID().toString();
         response.setHeader(REQUEST_ID_HEADER, requestId);
         request.setAttribute(REQUEST_ID_HEADER, requestId);
+        logger.debug("Generated new request ID: {}", requestId);
     }
     return requestId;
 }

155-159: Add safeguard against potential division by zero.

While the code checks if total is 0, it's good practice to make this more explicit and handle potential arithmetic exceptions.

 private double getErrorRate() {
     int total = totalRequests.getCount();
     int failed = failedRequests.getCount();
-    return total == 0 ? 0.0 : (double) failed / total * 100;
+    return total > 0 ? Math.min((double) failed / total * 100, 100.0) : 0.0;
 }

This change also caps the error rate at 100% as a safeguard against potential inconsistencies in the counters.


172-176: Consider adding batch operations for metrics updates.

The increment method adds a timestamp and increments the count atomically, but then immediately cleans up old timestamps. For high-traffic scenarios, this could create a lot of overhead.

Consider cleaning up old timestamps periodically rather than on every increment:

 void increment(Instant timestamp) {
     timestamps.add(timestamp);
     count.incrementAndGet();
-    cleanupOldTimestamps(timestamp);
+    // Cleanup old timestamps periodically instead of on every increment
+    if (timestamps.size() % 100 == 0) {
+        cleanupOldTimestamps(timestamp);
+    }
 }
service/src/main/java/greencity/health/HabitTrackingHealthIndicator.java (2)

53-70: Consider refactoring duplicate health response building logic

The code for building health responses has similar patterns in both the success and failure paths. Consider extracting a helper method to build the health response with common details, which would reduce duplication and make future maintenance easier.

@Override
public Health health() {
    try {
        LocalDateTime last24Hours = LocalDateTime.now().minusHours(24);
        ZonedDateTime zonedLast24Hours = last24Hours.atZone(ZoneId.systemDefault());

        Query newHabitsQuery = entityManager.createNativeQuery(
            "SELECT COUNT(*) FROM habits WHERE created_at >= :last24Hours");
        newHabitsQuery.setParameter("last24Hours", last24Hours);
        long newHabitsCount = ((Number) newHabitsQuery.getSingleResult()).longValue();

        Query newAssignsQuery = entityManager.createNativeQuery(
            "SELECT COUNT(*) FROM habit_assign WHERE create_date >= :last24Hours");
        newAssignsQuery.setParameter("last24Hours", zonedLast24Hours);
        long newAssignsCount = ((Number) newAssignsQuery.getSingleResult()).longValue();

        long totalActivityCount = newHabitsCount + newAssignsCount;

-       if (totalActivityCount >= minHabitUpdates) {
-           meterRegistry.gauge("app_habit_tracking_health", 1);
-           return Health.up()
-               .withDetail("habitTracking", "Habit tracking is active")
-               .withDetail("newHabitsLast24h", newHabitsCount)
-               .withDetail("newAssignsLast24h", newAssignsCount)
-               .withDetail("totalActivityLast24h", totalActivityCount)
-               .build();
-       } else {
-           meterRegistry.gauge("app_habit_tracking_health", 0);
-           return Health.outOfService()
-               .withDetail("habitTracking", "Low habit tracking activity")
-               .withDetail("newHabitsLast24h", newHabitsCount)
-               .withDetail("newAssignsLast24h", newAssignsCount)
-               .withDetail("totalActivityLast24h", totalActivityCount)
-               .withDetail("minThreshold", minHabitUpdates)
-               .build();
-       }
+       boolean isHealthy = totalActivityCount >= minHabitUpdates;
+       meterRegistry.gauge("app_habit_tracking_health", isHealthy ? 1 : 0);
+       
+       Health.Builder builder = isHealthy ? 
+           Health.up().withDetail("habitTracking", "Habit tracking is active") :
+           Health.outOfService().withDetail("habitTracking", "Low habit tracking activity");
+           
+       builder.withDetail("newHabitsLast24h", newHabitsCount)
+              .withDetail("newAssignsLast24h", newAssignsCount)
+              .withDetail("totalActivityLast24h", totalActivityCount);
+              
+       if (!isHealthy) {
+           builder.withDetail("minThreshold", minHabitUpdates);
+       }
+       
+       return builder.build();
    } catch (Exception e) {
        logger.error("Habit tracking health check failed: {}", e.getMessage());
        meterRegistry.gauge("app_habit_tracking_health", 0);
        return Health.down()
            .withDetail("habitTracking", "Error checking habit tracking")
            .withDetail("error", e.getMessage())
            .build();
    }
}

41-44: Consider using JPQL instead of native SQL queries

Using native SQL queries couples your code to the specific database schema. JPQL would provide better maintainability and database independence.

For example:

Query newHabitsQuery = entityManager.createQuery(
    "SELECT COUNT(h) FROM Habit h WHERE h.createdAt >= :last24Hours");
core/src/main/java/greencity/metrics/ActiveUsersInMemoryMetrics.java (1)

83-85: Reconsider the login count calculation method

The current implementation of getUserLoginsCount() sums the sizes of all login timestamp lists, which counts repeated logins from the same user multiple times. Consider if you want unique user login count instead.

private int getUserLoginsCount() {
-   return userLogins.values().stream().mapToInt(List::size).sum();
+   // If you want unique user logins:
+   return userLogins.size();
+   // Or if you want total logins but need to handle empty lists:
+   // return userLogins.values().stream().filter(list -> !list.isEmpty()).mapToInt(List::size).sum();
}
core/src/main/java/greencity/config/SecurityConfig.java (1)

131-132: Document filter ordering rationale

The position of filters in the security chain is important. Consider adding comments explaining why ActiveUsersInMemoryMetrics is placed after AccessTokenAuthenticationFilter and why RequestCountMetrics is placed before it.

+ // Place ActiveUsersInMemoryMetrics after auth filter to ensure we only track authenticated users
.addFilterAfter(new ActiveUsersInMemoryMetrics(meterRegistry), AccessTokenAuthenticationFilter.class)
+ // Place RequestCountMetrics before auth filter to capture all requests including authentication failures
.addFilterBefore(requestCountMetrics, AccessTokenAuthenticationFilter.class)
service/src/main/java/greencity/health/UserEngagementHealthIndicator.java (1)

29-49: Consider initializing the gauge once instead of repeatedly.

The gauge initialization in each condition branch could potentially create multiple gauges with the same name or cause memory leaks if called frequently.

Consider initializing the gauge once in the constructor and updating its value in the health method:

@Component
public class UserEngagementHealthIndicator implements HealthIndicator {
    private static final Logger logger = LoggerFactory.getLogger(UserEngagementHealthIndicator.class);
    private final UserRepo userRepo;
    private final MeterRegistry meterRegistry;
    private final int minUserThreshold;
+   private final AtomicInteger healthGauge;

    @Autowired
    public UserEngagementHealthIndicator(UserRepo userRepo, MeterRegistry meterRegistry,
        @Value("${user.engagement.min.threshold:5}") int minUserThreshold) {
        this.userRepo = userRepo;
        this.meterRegistry = meterRegistry;
        this.minUserThreshold = minUserThreshold;
+       this.healthGauge = meterRegistry.gauge("app_user_engagement_health", new AtomicInteger(0));
    }

    @Override
    public Health health() {
        try {
            LocalDateTime last24Hours = LocalDateTime.now().minusHours(24);
            long newUsers = userRepo.countByDateOfRegistrationAfter(last24Hours);

            if (newUsers >= minUserThreshold) {
-               meterRegistry.gauge("app_user_engagement_health", 1);
+               healthGauge.set(1);
                return Health.up()
                    .withDetail("userEngagement", "User engagement is normal")
                    .withDetail("newUsersLast24h", newUsers)
                    .build();
            } else {
-               meterRegistry.gauge("app_user_engagement_health", 0);
+               healthGauge.set(0);
                return Health.outOfService()
                    .withDetail("userEngagement", "Low user engagement detected")
                    .withDetail("newUsersLast24h", newUsers)
                    .withDetail("minThreshold", minUserThreshold)
                    .build();
            }

Note: Don't forget to import java.util.concurrent.atomic.AtomicInteger if you implement this change.

service/src/test/java/greencity/health/MemoryUsageHealthIndicatorTest.java (1)

36-46: Consider improving testability instead of using reflection.

While reflection works for testing, it makes the tests more brittle and harder to maintain.

Consider modifying the MemoryUsageHealthIndicator class to accept an optional MemoryMXBean parameter in the constructor, defaulting to ManagementFactory.getMemoryMXBean() if not provided:

// In MemoryUsageHealthIndicator.java
public class MemoryUsageHealthIndicator implements HealthIndicator {
    private final MeterRegistry meterRegistry;
    private final MemoryMXBean memoryMXBean;

-   public MemoryUsageHealthIndicator(MeterRegistry meterRegistry) {
+   public MemoryUsageHealthIndicator(MeterRegistry meterRegistry, MemoryMXBean memoryMXBean) {
        this.meterRegistry = meterRegistry;
-       this.memoryMXBean = ManagementFactory.getMemoryMXBean();
+       this.memoryMXBean = memoryMXBean;
    }

+   public MemoryUsageHealthIndicator(MeterRegistry meterRegistry) {
+       this(meterRegistry, ManagementFactory.getMemoryMXBean());
+   }
}

// Then in the test:
@BeforeEach
void setUp() {
-   healthIndicator = new MemoryUsageHealthIndicator(meterRegistry);
-   try {
-       java.lang.reflect.Field field = MemoryUsageHealthIndicator.class.getDeclaredField("memoryMXBean");
-       field.setAccessible(true);
-       field.set(healthIndicator, memoryMXBean);
-   } catch (Exception e) {
-       throw new RuntimeException("Failed to set MemoryMXBean for testing", e);
-   }
+   healthIndicator = new MemoryUsageHealthIndicator(meterRegistry, memoryMXBean);
}
core/src/test/java/greencity/metrics/RequestCountMetricsTest.java (1)

118-126: Make gauge-name assertions order-independent to reduce flakiness

Gauge.builder invocations are captured in a list whose order is not guaranteed by Micrometer’s internal registration logic.
Asserting by position (get(0)…get(4)) risks intermittent failures if the production code is refactored but still registers the same five gauges.

Consider asserting on containment instead:

List<String> names = nameCaptor.getAllValues();
assertTrue(names.containsAll(List.of(
        "http_requests_total_per_hour",
        "http_requests_successful_per_hour",
        "http_requests_failed_per_hour",
        "http_requests_prometheus_per_hour",
        "app_error_rate_per_hour")));
assertEquals(5, names.size());

This keeps the test stable while still guaranteeing that all required gauges are registered.

core/src/test/java/greencity/metrics/ActiveUsersInMemoryMetricsTest.java (1)

182-191: Consider providing a package-private helper instead of reflection in tests

setActiveUser manipulates the activeUsers map through reflection.
A small, package-private method (e.g., void putActiveUserForTest(String, Instant)) inside ActiveUsersInMemoryMetrics would:

  • Eliminate reflection complexity.
  • Preserve encapsulation at the public API level.
  • Fail fast at compile-time if the internal representation changes.

Reflection should be the last resort for test setup.

service/src/test/java/greencity/health/HabitTrackingHealthIndicatorTest.java (1)

71-72: Use double literal when verifying gauge values

MeterRegistry#gauge(String, double) expects a double. Verifying with an int works through widening, but Mockito performs strict type matching on the boxed value, which may lead to false-negatives.

-verify(meterRegistry).gauge("app_habit_tracking_health", 1);
+verify(meterRegistry).gauge("app_habit_tracking_health", 1.0);

Apply the same change for the 0 verifications. This tiny tweak prevents brittle test failures.

Also applies to: 94-95, 110-111, 133-134

core/src/main/java/greencity/metrics/GarbageCollectionMetrics.java (2)

58-60: Potential unbounded growth of history queues

A new record is added on every scrape, and only entries older than 1 hour are removed.
If Prometheus scrapes every 5 seconds, each queue can hold up to 3600 / 5 = 720 elements per collector, which is acceptable.
However, for sub-second scrapes or accidental mis-configuration, the queues could grow unbounded.

Consider guarding against pathologically high scrape rates:

private static final int MAX_SAMPLES = 1000; // safety cap
...
if (history.size() > MAX_SAMPLES) {
    history.poll();
}

This defensive cap prevents memory bloat while preserving the 1-hour window for normal scrape intervals.


34-48: Minor: unused lambda parameter

Gauge.builder(..., this, metrics -> getGCTimePerHour(gcBean)) supplies a lambda argument metrics that is never used.
You can replace the lambda with a method reference for clarity:

-Gauge.builder("app_gc_time_ms_per_hour", this, metrics -> getGCTimePerHour(gcBean))
+Gauge.builder("app_gc_time_ms_per_hour", this, unused -> getGCTimePerHour(gcBean))

(or simply obj -> …). Purely cosmetic, but it signals intent more clearly.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc81259 and e642880.

📒 Files selected for processing (36)
  • core/pom.xml (1 hunks)
  • core/src/main/java/greencity/config/SecurityConfig.java (5 hunks)
  • core/src/main/java/greencity/controller/MetricsController.java (1 hunks)
  • core/src/main/java/greencity/dto/metric/LoginEventDto.java (1 hunks)
  • core/src/main/java/greencity/metrics/ActiveUsersInMemoryMetrics.java (1 hunks)
  • core/src/main/java/greencity/metrics/CPUMetrics.java (1 hunks)
  • core/src/main/java/greencity/metrics/GarbageCollectionMetrics.java (1 hunks)
  • core/src/main/java/greencity/metrics/MemoryMetrics.java (1 hunks)
  • core/src/main/java/greencity/metrics/RequestCountMetrics.java (1 hunks)
  • core/src/main/java/greencity/metrics/ResponseTimeMetrics.java (1 hunks)
  • core/src/main/resources/application.properties (1 hunks)
  • core/src/test/java/greencity/controller/MetricsControllerTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/ActiveUsersInMemoryMetricsTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/CPUMetricsTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/GarbageCollectionMetricsTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/MemoryMetricsTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/RequestCountMetricsTest.java (1 hunks)
  • core/src/test/java/greencity/metrics/ResponseTimeMetricsTest.java (1 hunks)
  • dao/src/main/java/greencity/repository/NotificationRepo.java (2 hunks)
  • dao/src/main/java/greencity/repository/UserRepo.java (1 hunks)
  • docker-compose.yml (1 hunks)
  • prometheus.yml (1 hunks)
  • service-api/src/main/java/greencity/constant/ErrorMessage.java (1 hunks)
  • service/pom.xml (1 hunks)
  • service/src/main/java/greencity/health/DatabasePerformanceHealthIndicator.java (1 hunks)
  • service/src/main/java/greencity/health/DiskSpaceHealthIndicator.java (1 hunks)
  • service/src/main/java/greencity/health/HabitTrackingHealthIndicator.java (1 hunks)
  • service/src/main/java/greencity/health/MemoryUsageHealthIndicator.java (1 hunks)
  • service/src/main/java/greencity/health/NotificationServiceHealthIndicator.java (1 hunks)
  • service/src/main/java/greencity/health/UserEngagementHealthIndicator.java (1 hunks)
  • service/src/test/java/greencity/health/DatabasePerformanceHealthIndicatorTest.java (1 hunks)
  • service/src/test/java/greencity/health/DiskSpaceHealthIndicatorTest.java (1 hunks)
  • service/src/test/java/greencity/health/HabitTrackingHealthIndicatorTest.java (1 hunks)
  • service/src/test/java/greencity/health/MemoryUsageHealthIndicatorTest.java (1 hunks)
  • service/src/test/java/greencity/health/NotificationServiceHealthIndicatorTest.java (1 hunks)
  • service/src/test/java/greencity/health/UserEngagementHealthIndicatorTest.java (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
dao/src/main/java/greencity/repository/NotificationRepo.java (1)
dao/src/main/java/greencity/repository/impl/CustomNotificationRepoImpl.java (1)
  • Repository (25-119)
service/src/main/java/greencity/health/DiskSpaceHealthIndicator.java (5)
service/src/main/java/greencity/health/MemoryUsageHealthIndicator.java (1)
  • Component (13-58)
service/src/main/java/greencity/health/HabitTrackingHealthIndicator.java (1)
  • Component (18-80)
service/src/main/java/greencity/health/DatabasePerformanceHealthIndicator.java (1)
  • Component (14-48)
service/src/main/java/greencity/health/NotificationServiceHealthIndicator.java (1)
  • Component (16-61)
service/src/main/java/greencity/health/UserEngagementHealthIndicator.java (1)
  • Component (14-58)
core/src/main/java/greencity/metrics/GarbageCollectionMetrics.java (4)
core/src/main/java/greencity/metrics/ResponseTimeMetrics.java (1)
  • Component (19-126)
core/src/main/java/greencity/metrics/RequestCountMetrics.java (1)
  • Component (22-238)
core/src/main/java/greencity/metrics/CPUMetrics.java (1)
  • Component (10-34)
core/src/main/java/greencity/metrics/MemoryMetrics.java (1)
  • Component (10-43)
🪛 YAMLlint (1.35.1)
prometheus.yml

[error] 8-8: no new line character at the end of file

(new-line-at-end-of-file)

docker-compose.yml

[error] 27-27: no new line character at the end of file

(new-line-at-end-of-file)

🔇 Additional comments (40)
service-api/src/main/java/greencity/constant/ErrorMessage.java (1)

249-249: Good addition for error handling in the metrics filter.

This error message constant will help provide clear feedback when the ResponseTimeMetrics filter encounters null parameters, supporting a more robust implementation.

dao/src/main/java/greencity/repository/UserRepo.java (1)

928-928: Well-defined repository method for user engagement metrics.

This method follows Spring Data JPA naming conventions and will efficiently count users registered after a specified date. It's a good addition to support the new UserEngagementHealthIndicator component.

core/pom.xml (1)

266-277: Good integration of monitoring dependencies.

The addition of these three dependencies enables comprehensive application monitoring:

  • micrometer-registry-prometheus provides Prometheus metrics integration
  • aspectjweaver supports aspect-oriented programming for metrics capture
  • spring-boot-starter-actuator exposes health and metrics endpoints

These components form a solid foundation for the monitoring infrastructure being implemented.

service/pom.xml (1)

63-70: Appropriate monitoring dependencies for service module.

Adding the Micrometer Prometheus registry and Spring Boot Actuator dependencies to the service module ensures consistent monitoring capabilities across modules. This complements the same dependencies added to the core module.

core/src/test/java/greencity/controller/MetricsControllerTest.java (2)

1-33: Good test setup with well-structured initialization.

The test class has proper structure with clean separation of mocks and test data initialization in the setUp method. The testing strategy is clear and follows best practices.


34-44: Well-written test for the happy path.

This test correctly verifies the happy path scenario with appropriate assertions and verification of service method calls.

core/src/test/java/greencity/metrics/CPUMetricsTest.java (1)

1-95: Well-structured and comprehensive test class!

The test class provides thorough coverage of the CPUMetrics component with appropriate test cases for constructor initialization, gauge registration, and CPU usage calculation. The reflection approach to access private fields and methods is handled correctly, and the test cases appropriately verify both positive and negative CPU load scenarios.

service/src/test/java/greencity/health/UserEngagementHealthIndicatorTest.java (1)

32-55: Good test structure and use of MockedStatic

Nicely fixes the clock, verifies repository interaction, and checks the health details – solid, readable test!

core/src/main/java/greencity/metrics/ResponseTimeMetrics.java (1)

29-50: Excellent metric registration

Good use of Timer with percentiles and separate gauges for min/max/avg – clear and Prometheus-friendly.

core/src/test/java/greencity/metrics/MemoryMetricsTest.java (3)

48-67: Good test setup with well-structured mocking.

The test setup correctly mocks the gauge builders with their fluent API and properly injects the mocked MemoryMXBean via reflection. This approach ensures that the test isolates the component's behavior from its dependencies.


96-108: Well-structured test for a private method.

While testing private methods directly is sometimes debated, this approach using reflection provides good coverage of the internal logic. The test clearly sets up expectations and verifies the correct calculation of used memory.


110-120: Good boundary value testing.

The test correctly verifies the total memory calculation. Consider adding a test case where getMax() returns -1 (which can happen in some JVM configurations) to ensure the method handles this edge case correctly.

service/src/test/java/greencity/health/DiskSpaceHealthIndicatorTest.java (2)

34-54: Well-structured test for UP status.

This test case thoroughly verifies all aspects of the health check when disk space is sufficient. The expected values are clearly defined, and all relevant conditions and interactions are verified.


78-93: Good exception handling test.

The test properly verifies the behavior when an exception occurs, including checking that the gauge is set to 0 and that the appropriate error details are included. The verify(root, never()).getFreeSpace() assertion is a nice touch that confirms the exception short-circuits the execution.

core/src/main/java/greencity/metrics/RequestCountMetrics.java (1)

22-49: Well-structured metrics implementation with clear metric definitions.

The setup of different metric types and their descriptions is comprehensive and follows good practices. The time window constant and processedRequests map with its maximum size limit are good design choices.

service/src/main/java/greencity/health/HabitTrackingHealthIndicator.java (2)

35-79: Good exception handling and metric reporting

The implementation catches all exceptions, properly logs them, sets the health gauge to 0, and returns a "down" health status with error details. This provides good visibility into failure scenarios.


38-40:

Details

❓ Verification inconclusive

Be cautious with date type mismatch in queries

In your implementation, you're using LocalDateTime for the newHabitsQuery and ZonedDateTime for the newAssignsQuery. This mismatch might cause inconsistent time-based results when querying different tables.


🏁 Script executed:

#!/bin/bash
# Check database schema to verify date types in both habit tables
rg -A 2 "CREATE TABLE (habits|habit_assign)" --glob "*.sql"

Length of output: 61


🏁 Script executed:

#!/bin/bash
# Locate where the two query parameters are used in Java code
rg "newHabitsQuery" -n --glob "*.java"
rg "newAssignsQuery" -n --glob "*.java"
rg "last24Hours" -n --glob "*.java"
rg "zonedLast24Hours" -n --glob "*.java"

Length of output: 10954


Confirm consistent date parameter types in native queries

I didn’t find any schema files in this module to verify whether habits.created_at and habit_assign.create_date are TIMESTAMP or TIMESTAMP WITH TIME ZONE. Please double-check your database column types and ensure you bind the correct Java type:

• In HabitTrackingHealthIndicator.java

  • Line 43: newHabitsQuery.setParameter("last24Hours", last24Hours);
  • Line 48: newAssignsQuery.setParameter("last24Hours", zonedLast24Hours);

Recommended steps:

  1. Verify in your DB (or migration scripts) whether each column is with or without timezone.
  2. Use the matching Java type for both queries (e.g. both as LocalDateTime for TIMESTAMP or both as ZonedDateTime/OffsetDateTime for timezone-aware columns).
  3. If you need to normalize across tables, consider converting to UTC Instant or consistently applying ZoneId.systemDefault() before binding.

Let me know once the column types are confirmed and the parameters are unified!

core/src/main/java/greencity/metrics/ActiveUsersInMemoryMetrics.java (1)

36-44: Good practice: Using descriptive gauge names and metadata

The gauge definitions include descriptive names, documentation strings, and appropriate base units. This makes the metrics more understandable when viewed in monitoring systems.

core/src/main/java/greencity/config/SecurityConfig.java (1)

3-10: Good practice: Clearly defined imports for metrics components

The imports for the new metrics components are clearly organized at the top of the file. This helps maintain readability in a large configuration class.

core/src/test/java/greencity/metrics/ResponseTimeMetricsTest.java (3)

113-129: Good test coverage for error handling scenarios

The tests thoroughly check that the filter properly handles null parameters and throws the expected exceptions with the correct error messages. This helps ensure the component is robust against invalid inputs.


131-156: Excellent testing of exception handling in the filter chain

The test verifies that even when an exception occurs in the filter chain, response time metrics are still recorded correctly. This helps ensure monitoring isn't lost during error scenarios.


101-111: Good baseline test for zero requests

Testing the behavior when no requests have been processed ensures the metrics report sensible values (zeros) in this case, avoiding potential divide-by-zero or null pointer issues in production.

core/src/test/java/greencity/metrics/GarbageCollectionMetricsTest.java (5)

1-27: Well-organized imports and setup for GC metrics testing.

The test class has a clear structure with properly organized imports and effective use of Mockito for testing. The BeforeEach setup method properly initializes mock GC beans with realistic names.


49-75: Good use of static mocking to test metric initialization.

The test effectively uses MockedStatic to control Instant.now(), ManagementFactory, and Gauge.builder(), enabling isolated verification of gauge registration. You're properly verifying that both GC time and count gauges are registered with the correct tags for each collector.


77-108: Thorough testing of GC time calculation with edge cases.

Good job testing the GC time per hour calculation, including the first call (returning 0), subsequent calls with deltas, and the special case when collection time is -1 (unsupported). The use of reflection to test private methods is appropriate here.


110-137: Appropriate validation of GC count metrics.

The test correctly verifies the GC count per hour calculation, confirming both initial zero values and subsequent delta calculations. Your assertions include informative error messages that will make test failures easier to diagnose.


139-168: Effective testing of record cleanup functionality.

The test properly verifies that old GC records are cleaned up after the retention window, ensuring the metrics remain relevant. The two-hour time difference between records is a good choice to validate the cleanup mechanism.

service/src/test/java/greencity/health/NotificationServiceHealthIndicatorTest.java (5)

1-20: Clean and comprehensive import organization.

The test imports are well-organized, properly including all necessary Mockito and Spring Boot testing dependencies. Good job separating static imports for better readability.


21-33: Clear test setup with proper mocking and threshold definition.

The test class is well-structured with appropriate mocks for NotificationRepo and MeterRegistry. Setting a constant threshold value makes the tests more readable and maintainable.


34-58: Comprehensive testing of UP health status.

This test case thoroughly verifies the health indicator's behavior when notification count exceeds the threshold, including checking status, details, and metric gauge value. The time mocking ensures consistent time-based calculations.


60-85: Well-structured test for OUT_OF_SERVICE condition.

The test properly validates the health status when notification count is below threshold, checking both the status and all expected health details including the threshold value itself. Good verification of metric gauge value being set to 0.


87-112: Effective error handling test.

This test appropriately verifies the health indicator's behavior when the repository throws an exception, confirming DOWN status, proper error details, and metric gauge value updates. Good verification of exception message propagation.

service/src/main/java/greencity/health/UserEngagementHealthIndicator.java (3)

1-13: Well-organized imports for health indicator component.

The imports are clean and appropriately organized, including necessary Spring Boot health indicator imports and logging dependencies.


14-28: Good component structure with configurable threshold.

The class properly implements HealthIndicator with appropriate dependency injection and configurable threshold via @value annotation. The default threshold of 5 provides sensible behavior without explicit configuration.


49-58: Good error handling with appropriate logging.

The exception handling is well implemented with proper logging and meaningful health details. The health metric is correctly set to 0 when errors occur, indicating unhealthy status.

service/src/test/java/greencity/health/MemoryUsageHealthIndicatorTest.java (4)

1-18: Well-organized imports for memory usage health testing.

The imports are clean and properly organized, including all necessary testing and mocking dependencies.


48-73: Thorough testing of healthy memory state.

The test effectively verifies the health status is UP when memory usage is below threshold, with proper verification of health details and gauge metric registration. The calculations and verifications are comprehensive.


75-100: Good validation of high memory usage condition.

The test properly verifies the health indicator returns DOWN status when memory usage exceeds threshold, with appropriate details and metrics. The verification of dependency interactions is thorough.


102-118: Effective exception handling test.

The test appropriately verifies the health indicator's behavior when an exception occurs during memory usage retrieval, confirming DOWN status and proper error details. Good verification of exception message propagation.

service/src/test/java/greencity/health/DatabasePerformanceHealthIndicatorTest.java (1)

49-55: Use double literals when stubbing/verifying MeterRegistry#gauge

MeterRegistry#gauge(String, double) expects a primitive double.
Passing an int literal through eq(1) introduces an unnecessary autobox/unbox/widen step and slightly obscures intent.
Switching to 1.0 (or 1d) keeps the type explicit and avoids Mockito/Java’s silent conversions.

-when(meterRegistry.gauge(eq("app_database_health"), eq(1))).thenReturn(null);
+when(meterRegistry.gauge(eq("app_database_health"), eq(1.0))).thenReturn(null);

-verify(meterRegistry).gauge("app_database_health", 1);
+verify(meterRegistry).gauge("app_database_health", 1.0);

(The same change applies to the DOWN-status tests below.)

[ suggest_nitpick ]
[ request_verification ]

Also applies to: 60-66

Comment on lines +1 to +8
global:
scrape_interval: 15s

scrape_configs:
- job_name: 'GreenCity-Monitoring'
metrics_path: '/actuator/prometheus'
static_configs:
- targets: ['host.docker.internal:8080'] #замінити на адресу нашого додатку на Azure No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Configuration looks good, but needs a few adjustments.

The Prometheus configuration has a good structure with appropriate scrape interval and target configuration. However, I noticed a few items that should be addressed:

  1. The comment on line 8 is in Ukrainian. Consider using English for comments to ensure all team members can understand the configuration.

  2. The target uses host.docker.internal which works for local Docker development but won't work in Azure. Make sure to update this before deployment.

Apply this diff to fix the comment and add a new line at the end of the file:

      - targets: ['host.docker.internal:8080'] #замінити на адресу нашого додатку на Azure
+      - targets: ['host.docker.internal:8080'] #replace with our application address on Azure
+

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 8-8: no new line character at the end of file

(new-line-at-end-of-file)

Comment on lines +1 to +27
services:
prometheus:
image: prom/prometheus:latest
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
restart: always
networks:
- monitoring

grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
depends_on:
- prometheus
restart: always
networks:
- monitoring
environment:
- GF_SECURITY_ADMIN_USER=admin
- GF_SECURITY_ADMIN_PASSWORD=admin

networks:
monitoring:
driver: bridge No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve stability, security, and persistence in Docker configuration.

The Docker Compose setup looks well-structured, but there are several areas for improvement:

  1. Use specific version tags instead of latest to ensure stability and predictability in your deployments.
  2. Avoid hardcoded default credentials for Grafana in the compose file. Consider using environment variables or Docker secrets.
  3. Add persistent volumes for Grafana to preserve dashboards, configurations, and data.
  4. Add a version specification for the Docker Compose file.
+version: '3.8'
 services:
   prometheus:
-    image: prom/prometheus:latest
+    image: prom/prometheus:v2.45.0
     ports:
       - "9090:9090"
     volumes:
       - ./prometheus.yml:/etc/prometheus/prometheus.yml
+      - prometheus_data:/prometheus
     restart: always
     networks:
       - monitoring

   grafana:
-    image: grafana/grafana:latest
+    image: grafana/grafana:10.0.3
     ports:
       - "3000:3000"
     depends_on:
       - prometheus
     restart: always
     networks:
       - monitoring
+    volumes:
+      - grafana_data:/var/lib/grafana
     environment:
-      - GF_SECURITY_ADMIN_USER=admin
-      - GF_SECURITY_ADMIN_PASSWORD=admin
+      - GF_SECURITY_ADMIN_USER=${GRAFANA_ADMIN_USER:-admin}
+      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}

+volumes:
+  prometheus_data:
+  grafana_data:
+
 networks:
   monitoring:
     driver: bridge
+
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
services:
prometheus:
image: prom/prometheus:latest
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
restart: always
networks:
- monitoring
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
depends_on:
- prometheus
restart: always
networks:
- monitoring
environment:
- GF_SECURITY_ADMIN_USER=admin
- GF_SECURITY_ADMIN_PASSWORD=admin
networks:
monitoring:
driver: bridge
version: '3.8'
services:
prometheus:
image: prom/prometheus:v2.45.0
ports:
- "9090:9090"
volumes:
- ./prometheus.yml:/etc/prometheus/prometheus.yml
- prometheus_data:/prometheus
restart: always
networks:
- monitoring
grafana:
image: grafana/grafana:10.0.3
ports:
- "3000:3000"
depends_on:
- prometheus
restart: always
networks:
- monitoring
volumes:
- grafana_data:/var/lib/grafana
environment:
- GF_SECURITY_ADMIN_USER=${GRAFANA_ADMIN_USER:-admin}
- GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD:-admin}
volumes:
prometheus_data:
grafana_data:
networks:
monitoring:
driver: bridge
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 27-27: no new line character at the end of file

(new-line-at-end-of-file)

Comment on lines +12 to +23
@RestController
@RequestMapping("/metrics")
@RequiredArgsConstructor
public class MetricsController {
private final ActiveUsersInMemoryMetrics activeUsersInMemoryMetrics;

@PostMapping("/recordLogin")
public ResponseEntity<Void> recordLogin(@RequestBody LoginEventDto loginEventDto) {
activeUsersInMemoryMetrics.recordLogin(loginEventDto.getEmail(), loginEventDto.getLoginTime());
return ResponseEntity.ok().build();
}
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add validation, exception handling, and API documentation.

The controller implementation is clean and focused, but could benefit from several improvements:

  1. Add input validation to ensure the LoginEventDto contains valid data
  2. Implement exception handling to gracefully manage potential errors
  3. Add API documentation for better visibility and understanding
  4. Consider security implications of this endpoint
+import javax.validation.Valid;
+import io.swagger.annotations.Api;
+import io.swagger.annotations.ApiOperation;
+import io.swagger.annotations.ApiResponse;
+import io.swagger.annotations.ApiResponses;

 @RestController
 @RequestMapping("/metrics")
 @RequiredArgsConstructor
+@Api(tags = "Metrics", description = "Endpoints for application metrics")
 public class MetricsController {
     private final ActiveUsersInMemoryMetrics activeUsersInMemoryMetrics;

+    @ApiOperation(value = "Record user login event", notes = "Captures login events for metrics tracking")
+    @ApiResponses(value = {
+        @ApiResponse(code = 200, message = "Login event successfully recorded"),
+        @ApiResponse(code = 400, message = "Invalid input data"),
+        @ApiResponse(code = 500, message = "Internal server error")
+    })
     @PostMapping("/recordLogin")
-    public ResponseEntity<Void> recordLogin(@RequestBody LoginEventDto loginEventDto) {
-        activeUsersInMemoryMetrics.recordLogin(loginEventDto.getEmail(), loginEventDto.getLoginTime());
-        return ResponseEntity.ok().build();
+    public ResponseEntity<Void> recordLogin(@Valid @RequestBody LoginEventDto loginEventDto) {
+        try {
+            activeUsersInMemoryMetrics.recordLogin(loginEventDto.getEmail(), loginEventDto.getLoginTime());
+            return ResponseEntity.ok().build();
+        } catch (Exception e) {
+            // Log the exception
+            return ResponseEntity.internalServerError().build();
+        }
     }
 }

Comment on lines +26 to +57
@Override
public Health health() {
try {
long usedMemory =
memoryMXBean.getHeapMemoryUsage().getUsed() + memoryMXBean.getNonHeapMemoryUsage().getUsed();
long maxMemory = memoryMXBean.getHeapMemoryUsage().getMax();
double memoryUsagePercentage = (double) usedMemory / maxMemory * 100;

if (memoryUsagePercentage <= MAX_MEMORY_USAGE_PERCENTAGE) {
meterRegistry.gauge("app_memory_usage_health", 1);
return Health.up()
.withDetail("memoryUsage", "Memory usage is within safe limits")
.withDetail("memoryUsagePercentage", memoryUsagePercentage)
.withDetail("usedMemoryBytes", usedMemory)
.build();
} else {
meterRegistry.gauge("app_memory_usage_health", 0);
return Health.down()
.withDetail("memoryUsage", "High memory usage detected")
.withDetail("memoryUsagePercentage", memoryUsagePercentage)
.withDetail("usedMemoryBytes", usedMemory)
.build();
}
} catch (Exception e) {
logger.error("Memory usage health check failed: {}", e.getMessage());
meterRegistry.gauge("app_memory_usage_health", 0);
return Health.down()
.withDetail("memoryUsage", "Error checking memory usage")
.withDetail("error", e.getMessage())
.build();
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve gauge implementation and memory calculation.

The current implementation has several areas for improvement:

  1. The gauge is recreated on each health check, which can cause issues with Micrometer's gauge functionality
  2. The memory usage calculation could be more accurate
  3. There's no protection against division by zero
  4. Consider exposing the actual percentage rather than just a binary health state
 @Component
 public class MemoryUsageHealthIndicator implements HealthIndicator {
     private static final Logger logger = LoggerFactory.getLogger(MemoryUsageHealthIndicator.class);
     private final MeterRegistry meterRegistry;
     private final MemoryMXBean memoryMXBean;
-    private static final double MAX_MEMORY_USAGE_PERCENTAGE = 80.0;
+    @Value("${health.memory.max-usage-percentage:80.0}")
+    private double maxMemoryUsagePercentage;
+    private final AtomicReference<Double> memoryHealthGauge = new AtomicReference<>(1.0);

     @Autowired
     public MemoryUsageHealthIndicator(MeterRegistry meterRegistry) {
         this.meterRegistry = meterRegistry;
         this.memoryMXBean = ManagementFactory.getMemoryMXBean();
+        // Register gauge once at initialization
+        this.meterRegistry.gauge("app_memory_usage_health", memoryHealthGauge, AtomicReference::get);
+        this.meterRegistry.gauge("app_memory_usage_percentage", Tags.empty(), this, indicator -> {
+            try {
+                return calculateMemoryUsagePercentage();
+            } catch (Exception e) {
+                return -1.0;
+            }
+        });
     }

+    private double calculateMemoryUsagePercentage() {
+        long usedHeapMemory = memoryMXBean.getHeapMemoryUsage().getUsed();
+        long maxHeapMemory = memoryMXBean.getHeapMemoryUsage().getMax();
+        
+        // Protect against division by zero
+        if (maxHeapMemory <= 0) {
+            return 0.0;
+        }
+        
+        return (double) usedHeapMemory / maxHeapMemory * 100;
+    }
+
     @Override
     public Health health() {
         try {
-            long usedMemory =
-                memoryMXBean.getHeapMemoryUsage().getUsed() + memoryMXBean.getNonHeapMemoryUsage().getUsed();
-            long maxMemory = memoryMXBean.getHeapMemoryUsage().getMax();
-            double memoryUsagePercentage = (double) usedMemory / maxMemory * 100;
+            double memoryUsagePercentage = calculateMemoryUsagePercentage();
+            long usedHeapMemory = memoryMXBean.getHeapMemoryUsage().getUsed();
+            long usedNonHeapMemory = memoryMXBean.getNonHeapMemoryUsage().getUsed();
+            long totalUsedMemory = usedHeapMemory + usedNonHeapMemory;

-            if (memoryUsagePercentage <= MAX_MEMORY_USAGE_PERCENTAGE) {
-                meterRegistry.gauge("app_memory_usage_health", 1);
+            if (memoryUsagePercentage <= maxMemoryUsagePercentage) {
+                memoryHealthGauge.set(1.0);
                 return Health.up()
                     .withDetail("memoryUsage", "Memory usage is within safe limits")
                     .withDetail("memoryUsagePercentage", memoryUsagePercentage)
-                    .withDetail("usedMemoryBytes", usedMemory)
+                    .withDetail("usedHeapMemoryBytes", usedHeapMemory)
+                    .withDetail("usedNonHeapMemoryBytes", usedNonHeapMemory)
+                    .withDetail("totalUsedMemoryBytes", totalUsedMemory)
                     .build();
             } else {
-                meterRegistry.gauge("app_memory_usage_health", 0);
+                memoryHealthGauge.set(0.0);
                 return Health.down()
                     .withDetail("memoryUsage", "High memory usage detected")
                     .withDetail("memoryUsagePercentage", memoryUsagePercentage)
-                    .withDetail("usedMemoryBytes", usedMemory)
+                    .withDetail("usedHeapMemoryBytes", usedHeapMemory)
+                    .withDetail("usedNonHeapMemoryBytes", usedNonHeapMemory)
+                    .withDetail("totalUsedMemoryBytes", totalUsedMemory)
                     .build();
             }
         } catch (Exception e) {
             logger.error("Memory usage health check failed: {}", e.getMessage());
-            meterRegistry.gauge("app_memory_usage_health", 0);
+            memoryHealthGauge.set(0.0);
             return Health.down()
                 .withDetail("memoryUsage", "Error checking memory usage")
                 .withDetail("error", e.getMessage())
                 .build();
         }
     }

Don't forget to add the imports:

import java.util.concurrent.atomic.AtomicReference;
import io.micrometer.core.instrument.Tags;
import org.springframework.beans.factory.annotation.Value;

Comment on lines +40 to +42
private double getTotalMemory() {
return memoryMXBean.getHeapMemoryUsage().getMax();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve total memory calculation to handle undefined max heap memory

The getMax() method can return -1 if the maximum heap memory isn't defined. Consider adding a check for this case and providing a fallback like using the committed memory instead.

private double getTotalMemory() {
-    return memoryMXBean.getHeapMemoryUsage().getMax();
+    long maxHeapMemory = memoryMXBean.getHeapMemoryUsage().getMax();
+    return maxHeapMemory > 0 ? maxHeapMemory : memoryMXBean.getHeapMemoryUsage().getCommitted();
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private double getTotalMemory() {
return memoryMXBean.getHeapMemoryUsage().getMax();
}
private double getTotalMemory() {
long maxHeapMemory = memoryMXBean.getHeapMemoryUsage().getMax();
return maxHeapMemory > 0
? maxHeapMemory
: memoryMXBean.getHeapMemoryUsage().getCommitted();
}

Comment on lines +45 to +52
@BeforeEach
void setUp() {
meterRegistry = new SimpleMeterRegistry();
filter = new ActiveUsersInMemoryMetrics(meterRegistry);
SecurityContextHolder.setContext(securityContext);
}

@Test
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Reset SecurityContextHolder after each test to avoid cross-test leakage

SecurityContextHolder stores context in a thread-local that persists between tests executed on the same thread.
Since the suite sets a custom context in setUp(), subsequent tests might accidentally inherit authentication from a previous run, producing false positives/negatives when run in parallel or out of order.

Add a simple teardown:

@AfterEach
void clearContext() {
    SecurityContextHolder.clearContext();
}

This keeps tests isolated and future-proof.

Comment on lines +58 to +60
when(newHabitsQuery.setParameter(anyString(), any(LocalDateTime.class))).thenReturn(newHabitsQuery);
when(newAssignsQuery.setParameter(anyString(), any(ZonedDateTime.class))).thenReturn(newAssignsQuery);
when(newHabitsQuery.getSingleResult()).thenReturn(5L);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Inconsistent parameter types may break the stubbing

setParameter(..., any(LocalDateTime.class)) is stubbed for the habits query, whereas the habit_assign query is stubbed with any(ZonedDateTime.class).
Unless the production code truly passes a ZonedDateTime for one query and a LocalDateTime for the other, the second stub will not match and the call will return null, causing an NPE at runtime.

If the indicator uses LocalDateTime for both, update the stub:

-when(newAssignsQuery.setParameter(anyString(), any(ZonedDateTime.class))).thenReturn(newAssignsQuery);
+when(newAssignsQuery.setParameter(anyString(), any(LocalDateTime.class))).thenReturn(newAssignsQuery);

Please align the stubs with the real method signature to keep the test trustworthy.

Also applies to: 119-120

Comment on lines +42 to +49
try {
java.lang.reflect.Field entityManagerField =
HabitTrackingHealthIndicator.class.getDeclaredField("entityManager");
entityManagerField.setAccessible(true);
entityManagerField.set(habitTrackingHealthIndicator, entityManager);
} catch (Exception e) {
throw new RuntimeException("Failed to set entityManager", e);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Prefer dependency injection over reflection in tests

The test currently relies on reflection to inject the EntityManager. Reflection can silently break when the field name changes and makes the test harder to read.
Mockito can wire the dependency for you:

@ExtendWith(MockitoExtension.class)
class HabitTrackingHealthIndicatorTest {
    @Mock private EntityManager entityManager;
    @Mock private MeterRegistry meterRegistry;

    @InjectMocks
    private HabitTrackingHealthIndicator habitTrackingHealthIndicator =
            new HabitTrackingHealthIndicator(meterRegistry, 10); // threshold
}

With @InjectMocks, Mockito sets the entityManager field directly—no reflection or try/catch needed.
This small shift improves maintainability and removes the unchecked RuntimeException.

Comment on lines +74 to +88
double getGCCountPerHour(GarbageCollectorMXBean gcBean, Instant now) {
ConcurrentLinkedQueue<GCCountRecord> history = gcCountHistory.get(gcBean);
long currentGCCount = gcBean.getCollectionCount();

history.add(new GCCountRecord(now, currentGCCount));
cleanupOldGCCountRecords(gcBean, now);

if (history.size() < 2) {
return 0.0;
}

GCCountRecord oldest = history.peek();
long countDiff = currentGCCount - oldest.gcCount;
return countDiff >= 0 ? countDiff : 0.0;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Apply the same normalisation logic to GC count metric

For symmetry and accuracy, update getGCCountPerHour exactly as suggested for getGCTimePerHour, scaling the countDiff by 3600 / elapsedSeconds. This maintains consistent semantics across both GC metrics.

Comment on lines +54 to +68
double getGCTimePerHour(GarbageCollectorMXBean gcBean, Instant now) {
ConcurrentLinkedQueue<GCTimeRecord> history = gcTimeHistory.get(gcBean);
long currentGCTime = gcBean.getCollectionTime();

history.add(new GCTimeRecord(now, currentGCTime));
cleanupOldGCTimeRecords(gcBean, now);

if (history.size() < 2) {
return 0.0;
}

GCTimeRecord oldest = history.peek();
long timeDiff = currentGCTime - oldest.gcTime;
return timeDiff >= 0 ? timeDiff : 0.0;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Reported “per-hour” GC time is not actually normalised to 1 hour

getGCTimePerHour returns the raw difference between the newest and oldest collectionTime.
If the gauge is scraped every 2 minutes, the oldest record might be only 2 minutes old, so the diff reflects 2 minutes of GC time but is exposed as “per hour”. The value therefore fluctuates with the scrape interval rather than true hourly GC time.

A quick fix is to normalise the delta by the elapsed seconds:

 long timeDiff = currentGCTime - oldest.gcTime;
-long elapsed = Duration.between(oldest.timestamp, now).getSeconds();
-return timeDiff >= 0 ? timeDiff : 0.0;
+long elapsed = Duration.between(oldest.timestamp, now).getSeconds();
+if (elapsed == 0 || timeDiff < 0) {
+    return 0.0;
+}
+return ((double) timeDiff) * 3600 / elapsed; // scale to one hour

The same adjustment is needed in getGCCountPerHour. Without this correction the metric is misleading and hard to alert on.

@ChernenkoVitaliy
Copy link
Contributor

Please, attach the GitHub ticket with the task description to the PR description.

And for the future, use our template for PR desc.

@SashaPog SashaPog self-assigned this Nov 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants